A Real-time System Detecting Filled Pauses in Spontaneous Speech
Authors
Abstract
This paper describes a method for detecting filled pauses (including word lengthening), which are one of the hesitation phenomena. Detecting them is important in speech dialogue systems because filled pauses play valuable roles in oral communication. Although a few previous speech recognition systems have handled filled pauses, they did not detect them individually and consequently could not take their roles into account. Our method detects filled pauses by finding small fundamental frequency transitions and small spectral envelope deformations, under the assumption that articulator parameters do not change during filled pauses. Experimental results on a Japanese spoken dialogue corpus show that our system yielded a recall rate of 84.9% and a precision rate of 91.5%.
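The detection idea in the abstract (a voiced stretch in which both the fundamental frequency and the spectral envelope barely change) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_filled_pauses`, the per-frame inputs, the Euclidean distance between envelope vectors, and every threshold value are assumptions chosen only for illustration.

```python
from math import sqrt


def detect_filled_pauses(f0, envelopes, frame_ms=10.0, max_f0_step=2.0,
                         max_env_step=0.5, min_duration_ms=200.0):
    """Return (start, end) frame spans, end exclusive, of stable voiced runs.

    f0        -- per-frame fundamental frequency in Hz, 0 for unvoiced frames
    envelopes -- per-frame spectral envelope vectors (lists of floats)
    All thresholds are hypothetical values, not taken from the paper.
    """
    n = len(f0)
    stable = [False] * n
    for i in range(1, n):
        # Only voiced-to-voiced transitions can be part of a filled pause.
        if f0[i] <= 0 or f0[i - 1] <= 0:
            continue
        f0_step = abs(f0[i] - f0[i - 1])
        # Assumed deformation measure: Euclidean distance between
        # the envelope vectors of consecutive frames.
        env_step = sqrt(sum((a - b) ** 2
                            for a, b in zip(envelopes[i], envelopes[i - 1])))
        stable[i] = f0_step <= max_f0_step and env_step <= max_env_step

    min_frames = int(round(min_duration_ms / frame_ms))
    spans = []
    start = None
    # A trailing False closes any run that reaches the last frame.
    for i, ok in enumerate(stable + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_frames:
                spans.append((start, i))
            start = None
    return spans


# Synthetic input: 5 changing frames, 30 steady frames, 5 changing frames.
f0 = [100.0 + 10 * i for i in range(5)] + [120.0] * 30 \
    + [120.0 + 10 * i for i in range(1, 6)]
envelopes = [[float(i), 0.0] for i in range(5)] + [[5.0, 0.0]] * 30 \
    + [[5.0 + i, 0.0] for i in range(1, 6)]
print(detect_filled_pauses(f0, envelopes))  # [(6, 35)]
```

The minimum-duration check is a design choice of this sketch: short stable stretches also occur inside ordinary vowels, so only sustained runs are reported as candidates.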
Related resources
A real-time filled pause detection system for spontaneous speech recognition
This paper describes a method for automatically detecting filled (vocalized) pauses, which are one of the hesitation phenomena that current speech recognizers typically cannot handle. The detection of these pauses is important in spontaneous speech dialogue systems because they play valuable roles, such as helping a speaker keep a conversational turn, in oral communication. Although a few speec...
Detecting Filled Pauses in Tutorial Dialogs
As dialog systems become more capable, users tend to talk more spontaneously and less formally. Spontaneous speech includes features which convey information about the user’s state. In particular, filled pauses, such as um and uh, can indicate that the user is having trouble, wants more time, wants to hold the floor, or is uncertain. In this paper we present a first study of the acoustic charac...
Synthesising Filled Pauses: Representation and Datamixing
Filled pauses occur frequently in spontaneous human speech, yet modern text-to-speech synthesis systems rarely model these disfluencies overtly, and consequently they do not output convincing synthetic filled pauses. This paper presents a text-to-speech system that is specifically designed to model these particular disfluencies more effectively. A preparatory investigation shows that a synthet...
A Feature-based Filled Pause Detection System for Dutch
Nowadays, automatic speech recognizers have become quite good at recognizing well-prepared fluent speech (e.g. news readings). However, the recognition of unprepared or spontaneous speech is still problematic. Some important reasons for this are that spontaneous speech is less articulated, exhibits a high speaking rate and usually contains a lot of disfluencies. The latter occur when the speake...
Detecting laughter and filled pauses using syllable-based features
Identifying laughter and filled pauses is important to understanding spontaneous human speech. These are two common vocal expressions that are non-lexical and incredibly communicative. In this paper, we use a two-tiered system for identifying laughter and filled pauses. We first generate frame level hypotheses and subsequently rescore these based on features derived from acoustic syllable segme...